ABSTRACT


The Set Covering Problem (SCP) and the Set Partitioning Problem (SPP) 
represent an important class of all-binary (0-1) Integer Linear Programs 
(ILP). A review of the literature reveals extensive application of the 
SPP/SCP model to a wide set of practical problems. The basic model is
explained, and then many of the actual applications of this powerful 
model discovered in the literature review are discussed. The problems 
derived from these applications are difficult to solve with any method, 
and are particularly difficult to solve with optimal or exact algorithms. 
Various solution techniques are investigated within the framework of the 
classical simplex method with branch and bound enumeration. Several 
reformulations of the SPP/SCP as Integer Generalized Networks are examined. 
Extensive computational results are reported for several "real world" 
large-scale problems, and a convenient, compact format for data input
is proposed as a standard for this problem class.

I. INTRODUCTION


The Set Covering Problem (SCP) and the Set Partitioning Problem (SPP) 
represent an important class of all-binary (0-1) Integer Linear Programs 
(ILP's). These problems have binary variables, binary constraint 
coefficients and unit or integer resources. 

The basic SPP/SCP model has been known for over 25 years. It is 
enticing in formulation and deceptively simple. A review of the open 
literature reveals extensive application of the SPP/SCP model to a wide 
range of problems, including airline crew scheduling, vehicle routing, 
and facilities location. Even though the model has been intensively 
studied for both its intriguing binary structure and its potential for 
practical application, exact solution technologies for large-scale 
problems were not evident until the work of such researchers as Marsten 
[Ref. 1] began to appear in the early 1970's. Other early contributors 
are listed by Christofides [Ref. 2]. 

After first defining the basic model and discussing many of its 
applications, several reformulations of the SPP/SCP will be examined. 
Glover and Mulvey [Ref. 3] have presented two reformulations of the 
binary ILP as an Integer Generalized Network (GNIP). There is very
little computational evidence in the literature concerning these
reformulations; therefore, the computational behavior of this approach
will be tested and results reported for several problems. Another
reformulation of the SPP/SCP as an Integer Generalized Processing Network
(a network with special side constraints) will be examined and its
potential evaluated.

As indicated by the many reformulations, manipulative techniques, and
heuristic methods appearing in the literature, these problems are
difficult to solve reliably with any method, and are particularly
difficult to solve with optimal or exact algorithms. Various solution
techniques based on the classical simplex method with branch and bound
enumeration are investigated in this study. Some of the techniques
examined are basis factorization, elastic programming, enumeration
schemes, network and linear programming relaxation, logical reduction,
and heuristic methods for obtaining starting solutions. Extensive
computational results are reported for several "real world" large-scale
problems, and a convenient, compact format for data input is proposed as
a standard for this problem class.

II. THE BASIC MODEL AND MODEL GENERATION


A. THE BASIC MODEL 


The SCP formulated as an ILP is of the form:

    (1)  \min \sum_{j=1}^{n} c_j x_j

    (2)  \text{s.t.} \sum_{j=1}^{n} a_{ij} x_j \ge b_i,    i = 1, \ldots, m

    (3)  x_j \in \{0, 1\},                                 j = 1, \ldots, n

    (4)  c_j \ge 0,                                        j = 1, \ldots, n

    (5)  b_i \ge 0 and integer,                            i = 1, \ldots, m

    (6)  a_{ij} = 1 if column j covers row i, and 0 otherwise.

A minimal cost set of columns must be selected from A such that the
magnitudes of the right-hand sides (RHS), b_i, are "covered" or
satisfied. If (2) and (5) are replaced by

    (7)  \sum_{j=1}^{n} a_{ij} x_j = 1,    i = 1, \ldots, m


we have the SPP (sometimes referred to in the literature as the equality 
constrained SCP). This restriction of the SCP exhibits sufficient 
modelling and computational interest to be studied in its own right. For 
the SPP, the rows {i} represent a set which must be partitioned by a
combination of mutually exclusive columns at minimum cost.
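
The difference between covering and partitioning can be made concrete with a small feasibility check. The following is a minimal sketch, not part of the original study: the function name `evaluate`, the tiny 3-row matrix, and the cost values are hypothetical, chosen only to illustrate constraints (2) and (7).

```python
def evaluate(a, c, x, b=None, partition=False):
    """Return (feasible, cost) for a 0-1 column selection x.

    a[i][j] = 1 if column j covers row i; c[j] is the cost of column j.
    With partition=False the covering rows (2) are checked against b
    (all ones if b is omitted); with partition=True the equality rows (7)
    are checked instead.
    """
    m, n = len(a), len(c)
    if b is None:
        b = [1] * m
    cost = sum(c[j] * x[j] for j in range(n))
    for i in range(m):
        covered = sum(a[i][j] * x[j] for j in range(n))
        if partition:
            if covered != 1:        # row must be covered exactly once
                return False, cost
        elif covered < b[i]:        # row must be covered at least b_i times
            return False, cost
    return True, cost


# Hypothetical data: column 0 covers rows 0 and 1, column 1 covers
# rows 1 and 2, column 2 covers row 0 only.
a = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 0]]
c = [2, 3, 4]

cover_result = evaluate(a, c, [1, 1, 0])                       # row 1 covered twice
partition_fail = evaluate(a, c, [1, 1, 0], partition=True)     # same selection
partition_ok = evaluate(a, c, [0, 1, 1], partition=True)
```

Columns 0 and 1 together form a feasible cover but not a partition, because row 1 is covered twice; columns 1 and 2 form a partition at a higher cost. This is the sense in which the SPP is a restriction of the SCP.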



B. SIDE CONDITIONS

Many practical applications of the SPP/SCP formulation add logical 
constraints to the basic model discussed above. For example, suppose
there are p sets of columns S_k, k = 1, ..., p, in the model and only one
column from each of the sets S_k is eligible to be included in the final
solution. This restriction will produce constraints

    (8)  \sum_{j \in S_k} x_j = 1    for all k.

In another case, suppose that the solution must include exactly J columns.
This results in the cardinality constraint

    (9)  \sum_{j=1}^{n} x_j = J


being appended to the basic model. Introductory modelling texts such as 
Wagner [Ref. 4] and Gaver and Thompson [Ref. 5] discuss many such logical 
conditions formulated with binary variables. Any or all of these logical 
conditions can be included to extend the basic model for the purpose
required.


C. COLUMN GENERATION 

The art of formulating the practical SPP/SCP lies in the schemes used 
for column generation. It is possible, of course, to generate all 2^m - 1
columns capable of covering or partitioning the rows, but for any relatively 
large number of rows, the problem becomes intractable. This "all possible 
combinations" formulation is known as the Complete SPP/SCP, and even 
though techniques are emerging for attacking such problems [Ref. 6], 
efforts must be made to keep the number of permissible columns within 
the capabilities of the optimizer being used. Editing reductions can be
realized by incorporating such conditions as managerial specifications of
operating policy; dimensional restrictions on time, distance, and space; 
legal restrictions; labor union restrictions; cash flow restrictions; 
environmental restrictions; and as many other "real world" constraints as 
can be included in the column generation process. 

Incorporating such conditions into the column generator can handle 
most, if not all, side conditions and feasibility issues without including 
them as extensions of the basic SPP/SCP model. Some examples are described 
by Marsten and Muller [Ref. 7], Shanker, Turner, and Zoltners [Ref. 8],
and Cullen, Jarvis, and Ratliff [Ref. 9].


D. THE OBJECTIVE FUNCTION 

The cost coefficients c_j for the basic model can be of two types:
physical and ordinal. A physical cost is a coefficient in units of 
dollars, miles, time, etc., and represents the cost of covering certain 
rows with column j. The associated physical objective function expresses 
the cost of covering or partitioning the set represented by the rows. 

It is quite often the case, though, that the cost coefficients serve
only as a means of distinguishing between alternate columns. In many
political and social models, for example, a column will be assigned a
subjective number depicting some measure of acceptability (or
unacceptability), thus effecting an ordinal ranking structure in the
objective function. The objective then becomes a matter of selecting those
columns which minimize the ordinality. A much-used special case of the
ordinal cost structure is the unit-cost objective function in which
c_j = 1 for all j. The optimal solution for the unit-cost objective
function is the minimum number of columns capable of covering or
partitioning the row set without regard to physical cost or ordinal
ranking.

III. APPLICATIONS AND PROBLEM DESCRIPTIONS

A. APPLICATIONS

Set Covering and Set Partitioning Problems have been studied extensively
because of their many practical applications. The surveys by Garfinkel
and Nemhauser [Ref. 10] and Balas and Padberg [Ref. 11] list many useful
applications which have appeared in the literature. Some of these are
listed below, along with a few which have subsequently appeared.

    APPLICATION

 1. Truck Deliveries
 2. Tanker Routing
 3. Aircrew Scheduling
 4. Facilities Location
 5. Air Fleet Scheduling
 6. List Selection
 7. Political Districting
 8. Nuclear and Conventional Targeting
 9. Information Retrieval
10. Symbolic Logic
11. Switching Theory
12. Stock Cutting or Trimming
13. Line Balancing
14. Capacity Balancing [Ref. 43].*
15. PERT-CPM [Ref. 36].*
16. Frequency Allocation [Ref. 44].
17. Tracking Problems [Ref. 45].
18. Vehicle Routing [Ref. 9].
19. Sales Territory Design [Ref. 8].
20. Coloring Problems [Ref. 46],* [Ref. 47], [Ref. 48].*
21. Disconnecting Paths in a Graph [Ref. 49], [Ref. 50].
22. Cyclic Scheduling Problems [Ref. 51], [Ref. 52].


B. THE TRUCK DELIVERY PROBLEM

The first application we will examine is one that appears quite often 
in textbooks and is a simple illustration of the basic model. This 
problem will also provide an example which will be carried forward 
through discussion in later sections. 

Consider the problem of making deliveries to m locations by truck
(rail, aircraft, ship, messenger, etc.). There are n feasible routes to
choose from and a_ij = 1 if location i is on route j. A cost c_j (say,
time, dollars, miles) is assigned to route j. An optimal partition gives
a minimal cost routing that makes each delivery exactly once. An optimal
cover gives a minimal cost routing that makes sufficient deliveries to
satisfy the demand at each location. The optimal solution to the
unit-cost problem yields the minimum number of trucks necessary to make
the required deliveries.

Table 1 is the explicit tableau for an illustrative example of the
SPP. The flight scheduler for a small West Coast air freight company has
been assigned the task of delivering exactly one of seven identical
packages to each of seven western cities by tomorrow morning. All of the
delivery points can be reached in the required time with the current
schedule except San Diego. There are only three feasible ways to make
the San Diego delivery: extend route 5, extend route 6, or add a new
route 7. The cost of the alternatives is calculated and appears in the
tableau. By inspection, there are only two feasible solutions: (1)
Routes 1 and 5 at a cost of 6, and (2) Routes 1, 4, and 7 at a cost of
4. The minimum cost solution, therefore, is to add flight 7 to the
current schedule. The unit-cost solution or minimum partition is to use
solution (1).

TABLE 1. AIR FREIGHT EXAMPLE

    Route          |  1  2  3  4  5  6  7  |
    ---------------+-----------------------+-------
    Los Angeles    |  1  0  0  0  0  0  0  |  = 1
    San Francisco  |  1  1  1  0  0  0  0  |  = 1
    San Jose       |  1  1  1  0  0  0  0  |  = 1
    Denver         |  0  0  1  1  1  0  0  |  = 1
    Portland       |  0  0  0  1  1  0  0  |  = 1
    Seattle        |  0  0  0  1  1  1  0  |  = 1
    San Diego      |  0  0  0  0  1  1  1  |  = 1
    ---------------+-----------------------+-------
    Costs          |  0  0  0  0  6  7  4  |  obj.
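
The inspection argument for the air freight example can be checked by brute force. The sketch below is illustrative only: it assumes the 0-1 entries of Table 1 read as in the matrix `A` (a reading of the printed tableau that is consistent with the two solutions quoted in the text), and the helper name `feasible_partitions` is hypothetical.

```python
from itertools import product

# Assumed reading of Table 1: rows are cities, columns are routes 1..7.
A = [
    [1, 0, 0, 0, 0, 0, 0],  # Los Angeles
    [1, 1, 1, 0, 0, 0, 0],  # San Francisco
    [1, 1, 1, 0, 0, 0, 0],  # San Jose
    [0, 0, 1, 1, 1, 0, 0],  # Denver
    [0, 0, 0, 1, 1, 0, 0],  # Portland
    [0, 0, 0, 1, 1, 1, 0],  # Seattle
    [0, 0, 0, 0, 1, 1, 1],  # San Diego
]
COSTS = [0, 0, 0, 0, 6, 7, 4]


def feasible_partitions(a, costs):
    """Enumerate all 2^n column selections and keep the exact partitions."""
    n = len(costs)
    found = []
    for x in product((0, 1), repeat=n):
        # SPP rows (7): every city is served by exactly one chosen route.
        if all(sum(row[j] * x[j] for j in range(n)) == 1 for row in a):
            routes = tuple(j + 1 for j in range(n) if x[j])
            cost = sum(costs[j] for j in range(n) if x[j])
            found.append((routes, cost))
    return found


solutions = feasible_partitions(A, COSTS)
cheapest = min(solutions, key=lambda s: s[1])       # minimum-cost partition
fewest = min(solutions, key=lambda s: len(s[0]))    # unit-cost partition
```

Under this reading, the enumeration returns exactly the two partitions named in the text. Full enumeration is workable here only because there are 2^7 = 128 selections; it is not a method for the large-scale problems discussed later.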

Two large-scale, real-life problems of this type have been examined 
in this study: TRUCK and TANKER. TRUCK is a nationwide, intercity truck
routing problem, a large SCP, and fits the basic description above.
TANKER is a worldwide oil tanker fleet scheduling problem which extends
the basic SPP model to help choose between company-owned and charter 
tankers to meet refinery delivery requirements from available loading 
volumes and origins. Each cargo, company-owned ship, and potential 
charter vessel is represented by a row. Cargoes must be carried, and 
ships must either be used, or scheduled in demurrage. Each column 
represents a feasible route for a particular ship; during the planning 
horizon, it may carry zero or more cargoes. The cost of each route may
be calculated ordinally (based on fleet size) or economically (based on 
operating costs). 

The problem dimensions for TRUCK and TANKER are listed in Table 2.
NZEL is the total number of non-zero elements, and NCE is the average
number of non-zero elements per column.

TABLE 2. TRUCK DELIVERY PROBLEM DIMENSIONS

              |  ROWS     COLUMNS        NZEL       NCE   MODEL
    ----------+------------------------------------------------
    TRUCK     |   239        4752       30075       8.0   SCP
    TANKER    |   166  [illegible]  [illegible]     4.1   SPP


C. THE AIRCREW SCHEDULING PROBLEM 

An airline has a set of m flight legs, each of which requires a crew. 
Given the airline's timetable, a set of n possible crew rotations can be 
generated. Each crew rotation is a sequence of scheduled flight segments
constituting a roundtrip (a sequence departing from and returning to one
of the airline's crew bases). A cost is calculated for each rotation,
and once a complete set of flyable rotations has been generated, the
problem is to select an optimal, feasible subset.

Depending on whether or not crew members are allowed to be passengers 
on certain flights, optimal covers or optimal partitions yield optimal 
schedules. "Deadheading" is the practice of allowing a crew to travel as 
passengers on certain flights. Planned deadheading can be accommodated 
with the partitioning model. If rotation j includes a planned deadhead
on flight segment i, then a_ij is set to zero rather than one. If
unplanned deadheading is allowed, however, then a covering problem must
be solved.

A typical side condition common to these models is the Crew Base
Constraint of the form

    \sum_{j \in D_s} H_j x_j \le M_s    for each crew base s,

where H_j is the number of flying hours, per month, associated with
rotation j; M_s is the maximum number of flying hours available per
month at crew base s; and D_s is the set of rotations flown out of crew
base s.
All of the problems of this type were provided by Professor Roy E. Marsten,
University of Arizona, and are described in [Ref. 23]. TIGER1 and TIGER2
are examples of crew scheduling problems generated for Flying Tiger
Airlines. AMERICAN is a large crew scheduling problem generated by
American Airlines. (All of the airline problems in this study were solved
without crew base constraints.) The last problem in this class is BUS, a
driver-scheduling problem generated for Helsinki City Transport as
described in [Ref. 23].

TABLE 3. AIRLINE CREW SCHEDULING PROBLEM DIMENSIONS

              |  ROWS   COLUMNS         NZEL          NCE   MODEL
    ----------+----------------------------------------------------------
    TIGER1    |   160       636         4134  [illegible]   [illegible]
    TIGER2    |   107      2188         8266          3.8   [illegible]
    AMERICAN  |    95      9318  [illegible]  [illegible]   SPP
    BUS       |    56       930         3365          6.0   [illegible]


D. THE MAXIMAL SET COVERING PROBLEM 

Either the facilities location problem or the list selection problem
can be formulated as a Maximal Set Covering Problem. This problem
differs from the basic SCP because we no longer require that all rows be
covered; rather, the objective is to cover as many rows as possible
subject to various constraints. To accomplish this, m continuous variables,
y_i, are added to the basic model to produce the following Mixed Integer
Program (MIP):

          \min \sum_{i=1}^{m} y_i

          \text{s.t.} \sum_{j=1}^{n} a_{ij} x_j + y_i \ge 1,    i = 1, \ldots, m

    (10)  \sum_{j=1}^{n} c_j x_j \le B

          x_j \in \{0, 1\},    j = 1, \ldots, n

          y_i \ge 0,           i = 1, \ldots, m
The constraint (9) limits the number of rows which may be covered by
specifying that only J columns can be used. The constraint (10) is a
budget constraint which specifies that as many rows as possible be
covered for B dollars. The sense of the objective function here is to
minimize the number of rows left uncovered.

Another formulation in the same spirit replaces the objective function
by the familiar \min \sum_j c_j x_j, and adds the constraint

    \sum_{i=1}^{m} y_i \le M.

This formulation seeks the minimum cost set of columns which leaves at
most M rows uncovered.
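
The role of the slack variables y_i and the budget B can be seen on a toy instance. The following is a minimal sketch under stated assumptions: the function name `maximal_cover`, the 4-row matrix, the costs, and the budgets are all hypothetical, and full enumeration stands in for a real MIP solver. For a fixed x, the best choice of y_i is 1 if row i is uncovered and 0 otherwise, so the objective simply counts uncovered rows.

```python
from itertools import product


def maximal_cover(a, costs, budget):
    """Return (uncovered_rows, x) minimizing uncovered rows within budget."""
    m, n = len(a), len(costs)
    best = None
    for x in product((0, 1), repeat=n):
        # Budget constraint (10): total cost of chosen columns at most B.
        if sum(costs[j] * x[j] for j in range(n)) > budget:
            continue
        # y_i = 1 exactly when no chosen column covers row i.
        uncovered = sum(
            1 for i in range(m)
            if sum(a[i][j] * x[j] for j in range(n)) == 0
        )
        if best is None or uncovered < best[0]:
            best = (uncovered, x)
    return best


# Hypothetical data: column 0 covers rows 0-1, column 1 rows 1-2,
# column 2 rows 2-3.
a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
costs = [3, 2, 3]

tight = maximal_cover(a, costs, budget=5)   # cannot afford columns 0 and 2
loose = maximal_cover(a, costs, budget=6)   # columns 0 and 2 cover all rows
```

With a budget of 5 one row must stay uncovered; raising the budget to 6 admits the full cover. Setting B to the optimal cost of the underlying SCP therefore makes the maximal covering problem seek the same solution as the SCP.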

Dwyer and Evans [Ref. 30] have applied a similar formulation to the
"list selection problem." The list selection problem selects a set of
subscriber lists which maximizes the proportion of customers reachable
with direct mail pieces. The rows correspond to magazine subscribers,
and the columns to individual magazines. Let a_ij = 1 if individual i
subscribes to magazine j, and zero otherwise.

Moore and Revelle [Ref. 28] have applied this formulation to a
hierarchical facilities location problem. The rows represent demand
points and the columns represent various location strategies. The objective
is to pick those strategies which cover as much of the demand as possible.

STEINER1 and STEINER2 are two computationally difficult set covering
problems published by Fulkerson, Nemhauser, and Trotter [Ref. 53]. Each
row has exactly three non-zero elements, b_i = 1 for all i, and the
objective function is of the unit-cost type. These basic SCP's were
extended to MCOVER1 and MCOVER2 respectively, in order to evaluate the
difficulty of the Maximal Covering Problem.


TABLE 4. MAXIMAL COVERING PROBLEM DIMENSIONS

              |  ROWS      COLUMNS         NZEL          NCE   MODEL
    ----------+-----------------------------------------------------------
    STEINER1  |   117           27          351           13   SCP
    MCOVER1   |   118          144          495           13   SCP (ext)
    STEINER2  |   330           45          990  [illegible]   SCP
    MCOVER2   |   331  [illegible]  [illegible]  [illegible]   SCP (ext)


Only the budget constraint (10) was added to produce the extended 
problems. The value of B was set so that MCOVER1 seeks the same 
optimal solution as STEINER1, and MCOVER2 seeks the same optimal solution 
as STEINER2.

IV. COMPUTATIONAL DIFFICULTIES


A. THE BRANCH AND BOUND ENUMERATION METHOD 

It is evident that the SCP/SPP is a powerful model with many useful 
applications. Unfortunately, it is also true that the large-scale 
SCP/SPP is difficult to solve to optimality. In fact, Karp [Ref. 54] has 
shown the set covering problem to be an NP-hard combinatorial problem.
The solution techniques investigated here involve simplex-based enumeration,
often called branch and bound.

Branch and bound is an enumerative method that has been used
successfully to optimize a variety of combinatorial problems. The basic
principle is to methodically search the set of possible integer solutions
in such a way that not all possibilities need be explicitly considered.
The theoretical framework for this study is provided in the following, 
which has been adapted from Geoffrion and Marsten [Ref. 10]. The procedure 
of branch and bound is described in terms of three concepts: separation, 
relaxation, and fathoming. 

1. Separation 

For any optimization problem (P), let F(P) denote its set of
feasible solutions. Problem (P) is said to be separated into subproblems
if the following conditions hold:

1. Every feasible solution of (P) is a feasible solution of exactly
one of the subproblems.

2. A feasible solution of any of the subproblems is a feasible solution
of (P).

The procedure is to first make a reasonable effort to solve (P).
If this effort is unsuccessful, separate (P) into two subproblems,
thereby initiating what will be called a candidate list of subproblems.
A reasonable representation of the candidate list may be an "enumeration
tree" which reveals the partial ordering of consideration among candidates
in the list. Extract one of the subproblems from the list and call it
the current candidate problem (CP). If (CP) can be solved, extract a new
candidate problem from the list; otherwise, separate (CP) and add its
"descendants" to the candidate list. Continue in this fashion until the
candidate list is exhausted (i.e., every branch of the enumeration tree
has been examined). If we refer to the best solution found so far to any
candidate problem as the current incumbent, then the final incumbent must
obviously be an optimal solution of (P).

The technique of separation involves "branching" on a single 
integer variable. For the SPP/SCP where x_j is declared to be a binary
variable, the ILP can be separated into two subproblems by means of the
mutually exclusive and exhaustive restrictions x_j = 0 or x_j = 1. An
enumeration tree may be visualized with a vertex associated with each
separation and an edge with each restriction. The tree predecessor
relationship among vertices reveals the ordering among separations and 
their associated restrictions. This enumeration tree provides a visually 
appealing illustration of the solution sequence. 

2. Relaxation 

Any constrained optimization problem (P) can be "relaxed" by
loosening its constraints, resulting in a new problem (P_R). By far the
most popular type of relaxation for the ILP is to replace the integrality
restriction on the variables of (P) by simple bounds on the variables,
producing the continuous problem (P_R). The only requirement for (P_R)
to be a valid relaxation is that F(P) \subseteq F(P_R). For the
minimization problem, this definition implies:

1. If (P_R) has no feasible solutions, then the same is true of (P).

2. The minimal value of (P) is no less than the minimal value of (P_R).

3. If an optimal solution of (P_R) is feasible in (P), then it is an
optimal solution of (P).


In selecting between the alternative types of relaxation for a
given problem, there are two main criteria to be considered. First, it
is desirable for the relaxed problem to be significantly easier to solve
than the original. Second, one would like (P_R) to yield an optimal
solution of (P), or, failing that, the minimal value of (P_R) should be
as close as possible to that of (P). The distance between the minimal
values of (P_R) and (P) is often described as the "gap," and is used as
a measure of the "strength" (small gap) or "weakness" (large gap) of the
relaxation (P_R). Unfortunately, the objectives that (P_R) be both
"strong" and easy to solve are antagonistic. In general, the easier (P_R)
is to solve, the greater the "gap" is between the original and relaxed
problems.
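
The bounding property and the gap can be written out for the SCP of equations (1)-(6). The notation z(.) for the minimal value of a problem is introduced here only for illustration and does not appear in the original; the relaxation shown is the continuous one, in which each binary restriction is replaced by simple bounds.

```latex
% Continuous (LP) relaxation of the SCP: replace x_j \in \{0,1\} by bounds.
\begin{align*}
(P):\quad   z(P)   &= \min \Big\{ \sum_{j=1}^{n} c_j x_j \;:\;
                     \sum_{j=1}^{n} a_{ij} x_j \ge b_i \ \forall i,\;
                     x_j \in \{0,1\} \ \forall j \Big\} \\
(P_R):\quad z(P_R) &= \min \Big\{ \sum_{j=1}^{n} c_j x_j \;:\;
                     \sum_{j=1}^{n} a_{ij} x_j \ge b_i \ \forall i,\;
                     0 \le x_j \le 1 \ \forall j \Big\} \\[4pt]
F(P) \subseteq F(P_R) \;&\Longrightarrow\; z(P_R) \le z(P),
\qquad \text{gap} \;=\; z(P) - z(P_R) \;\ge\; 0 .
\end{align*}
```

The inequality holds because minimizing the same objective over a larger feasible set cannot give a larger minimum; a gap of zero with an integer optimal solution of (P_R) is the situation described in implication 3 above.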

3. Fathoming 

Let (CP) be a typical candidate problem arising from the attempt 
to solve (P). The ultimate objective in dealing with (CP) is to determine 
whether its feasible region F(CP) may possibly contain an optimal solution 
of (P), and if it does, to find it. If it can be ascertained by some 
means that F(CP) cannot contain a feasible solution better than the 
incumbent (the best feasible solution yet found), this is certainly 
good enough to dismiss (CP) from further consideration, and we say that
(CP) has been fathomed. If an optimal solution of (CP) can actually be 
found, we also say that (CP) has been fathomed. In either case, the 
candidate problem has been entirely resolved for purposes of enumeration, 
and no further separations of (CP) are necessary. Thus, the subproblems 
which would arise as restricted descendants of (CP) have been enumerated 
implicitly by either the bounding argument or the feasibility argument. 
Candidate problem (CP) is fathomed if any one of these criteria 
is satisfied: 
1. An analysis of (CP) reveals that (CP) has no feasible solution. 


2. An analysis of (CP) reveals that (CP) has no feasible solution 
better than the incumbent. 


3. An analysis of (CP) reveals an optimal solution of (CP); e.g., an
optimal solution of (CP_R) is found which happens to be feasible in (CP).


4. General Tree Search Procedure 


STEP 1: Initialize the candidate list with the ILP. Set the incumbent
        value, Z*, equal to infinity.

STEP 2: STOP if the candidate list is empty. If there exists an incumbent,
        then it must be optimal in the ILP; otherwise, the ILP has no
        feasible solution.

STEP 3: Select a candidate problem (CP) from the list and solve its
        relaxation (CP_R).

STEP 4: Fathoming Criterion 1. If the outcome of STEP 3 reveals (CP)
        to be infeasible, go to STEP 2.

STEP 5: Fathoming Criterion 2. If the outcome of STEP 3 reveals (CP)
        has no feasible solution better than the incumbent, Z*, go to
        STEP 2.

STEP 6: Fathoming Criterion 3. If the outcome of STEP 3 reveals an
        optimal solution of (CP), go to STEP 8.

STEP 7: Separate (CP) and add its descendants to the candidate list.
        Go to STEP 2.

STEP 8: A feasible solution of ILP has been found. If the value of the
        (CP) is less than Z*, record this solution as the new incumbent
        and set Z* = value of (CP). Go to STEP 2.

The degrees of freedom in STEP 3 provide a host of options.
Critical among these is the selection mechanism for branch variables. A
good branching strategy makes it possible to avoid searching large
portions of the enumeration tree, thus greatly reducing time spent in the
enumeration process. STEP 3 can also be prohibitively expensive if the
solution of (CP_R) is not easy to generate from the solution of (CP).
This step requires either storage of many (CP) solutions or a restriction
in the sequence for branch solutions. For instance, "fixed order
enumeration" permits branching only on the last element in the candidate
list previously associated with a (CP).
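
The eight steps can be traced in a few lines of code. The sketch below is illustrative and is not the implementation studied here: instead of the LP relaxation it uses a much weaker, purely combinatorial relaxation of the SPP (each row may be covered at most once, and free variables are set to zero, which is optimal for the relaxation because all costs are non-negative). The function name `branch_and_bound_spp`, the last-in-first-out candidate selection, and the fixed branching order x_1, x_2, ... are assumptions made for the example.

```python
import math


def branch_and_bound_spp(a, c):
    """Tree search for the SPP following STEPs 1-8; returns (Z*, x)."""
    m, n = len(a), len(c)
    candidates = [()]                   # STEP 1: tuple = values fixed for x_1..x_k
    z_star, incumbent = math.inf, None
    while candidates:                   # STEP 2: stop when the list is empty
        fixed = candidates.pop()        # STEP 3: select (CP), last in first out
        k = len(fixed)
        cover = [sum(a[i][j] * fixed[j] for j in range(k)) for i in range(m)]
        bound = sum(c[j] * fixed[j] for j in range(k))   # value of (CP_R)
        # STEP 4: infeasible if a row is covered twice, or is uncovered
        # and no free column can still cover it.
        if any(v > 1 for v in cover):
            continue
        if any(cover[i] == 0 and not any(a[i][j] for j in range(k, n))
               for i in range(m)):
            continue
        # STEP 5: nothing in (CP) can beat the incumbent.
        if bound >= z_star:
            continue
        # STEP 6: relaxed optimum (free variables at zero) is feasible in (CP).
        if all(v == 1 for v in cover):
            z_star = bound              # STEP 8: record the new incumbent
            incumbent = fixed + (0,) * (n - k)
            continue
        # STEP 7: separate on the next variable, x = 0 and x = 1.
        candidates.append(fixed + (0,))
        candidates.append(fixed + (1,))
    return z_star, incumbent


# Air freight example, with the tableau entries as read from Table 1.
A = [
    [1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1],
]
COSTS = [0, 0, 0, 0, 6, 7, 4]
best_value, best_x = branch_and_bound_spp(A, COSTS)
```

On this small example the search fathoms the candidate containing routes 1 and 5 by the bounding argument of STEP 5, since its cost of 6 cannot improve on the incumbent of 4. A stronger relaxation such as the LP would fathom candidates higher in the tree, at the price of more work per candidate; this is the trade-off between strength and ease of solution described under Relaxation.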

Most successful implementations of this general scheme use the
solution of the LP relaxation of the ILP to obtain the bounds required
for the branch and bound enumeration. Christofides [Ref. 2] and Marsten
[Ref. 1] report that most of the current large-scale algorithms use LP to
obtain bounds for their various enumeration procedures. Exceptions are
Etcheberry [Ref. 55], who uses "Lagrangian Relaxation" to obtain the bounds,
and Glover and Mulvey [Ref. 3], who reformulate and use Generalized
Network Relaxations.

The success of the branch and bound scheme depends on good
branching strategies and the ability to obtain good bounds efficiently
during the tree search. Typically, many LP's must be solved and even
though the integer requirement is relaxed for each LP restriction, there
are two serious problems associated with these LP's which make them hard
to solve: numerical instability and degeneracy.

B. NUMERICAL INSTABILITY

The concept of a basis for the linear program must first be discussed
in order to explain the computational difficulties. Consider the system
of equalities AX = b where X is an n-vector, b an m-vector, and A is an
m x n matrix (m < n). From the n columns of A, we select a set of m
linearly independent columns and denote the m x m matrix determined by
these columns by B. The matrix B is then non-singular and we may uniquely
solve the equations BX_B = b for the m-vector X_B, namely, X_B = B^{-1}b.
If all n - m components of X not associated with columns of B are set
equal to zero, the solution to the resulting set of equations is said to
be a basic solution to AX = b with respect to the basis B. B is called a
basis since its m linearly independent columns span the space E^m.
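
A basic solution can be computed directly from this definition. The following is a minimal sketch, not taken from any LP system discussed in this study: the helper names `solve` and `basic_solution` and the 2 x 4 example matrix are hypothetical, and exact rational arithmetic is used so that round-off plays no part in the illustration.

```python
from fractions import Fraction


def solve(B, b):
    """Solve B x = b exactly by Gauss-Jordan elimination.

    B must be square and non-singular; a singular B raises StopIteration
    when no pivot can be found.
    """
    m = len(B)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)]
           for row, rhs in zip(B, b)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[m] for row in aug]


def basic_solution(A, b, basis):
    """Return the n-vector X with X_B = B^{-1} b and all other entries zero."""
    m, n = len(A), len(A[0])
    B = [[A[i][j] for j in basis] for i in range(m)]   # chosen columns of A
    x_B = solve(B, b)
    x = [Fraction(0)] * n
    for pos, j in enumerate(basis):
        x[j] = x_B[pos]
    return x


# Hypothetical 2 x 4 system (m = 2, n = 4) with binary coefficients.
A = [[1, 0, 1, 1],
     [0, 1, 1, 0]]
b = [1, 1]
x_first = basic_solution(A, b, basis=[0, 1])
x_second = basic_solution(A, b, basis=[2, 3])
```

In the second basis the basic variable for column 3 takes the value zero, so that basic solution is degenerate in the sense discussed under Degeneracy; with binary coefficients and identical right-hand sides this happens readily.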

It can be seen from the above explanation that the transformation by 
the basis inverse is necessary to obtain a basic solution. It is also 
true that all large-scale LP systems available today require some form of 
representation of this basis inverse transformation in order to function 
efficiently. One popular representation is the Product Form of the 
Inverse described by Orchard-Hays [Ref. 56]. Another example is the 
explicit sub-kernel representation described by Graves [Ref. 57]. 

Unfortunately, the columns of the SPP/SCP are often nearly linearly 
dependent. For instance, a route generator will produce a base route to, 
say, five locations. By substituting alternate locations one at a time 
into the base route, many routes are generated which differ by only one 
or two elements. This can produce an ill-conditioned basis whose inverse 
can contain numbers so large, or so small, that after a few iterations
with real arithmetic, the computer is unable to maintain sufficient
significance to provide the numerical stability necessary for the LP
algorithm to converge, or if it does converge, to produce the true
optimum.

To attempt to overcome the numerical instability, it is necessary to
"clean up" the representation of the inverse by a process known as
reinversion. There are many different reinversion schemes available, but
in essence, they all use the original problem data to generate a new
representation of the inverse which is relatively free from accumulated
round-off error. Reinversion is computationally expensive, and for the
SPP/SCP it is often necessary to reinvert quite frequently, thus slowing
the computation of the bounds needed by the enumeration scheme.


C. DEGENERACY 

The primal simplex algorithm for the solution of the LP proceeds from 
one basic feasible solution of the constraint set of a problem to another 
in such a way as to continually decrease the value of the objective
function until a minimum is reached. If one or more of the basic variables 
in a basic solution has value zero, that solution is said to be a degenerate 
basic solution. For the SPP/SCP, it is important to note that seeking 
the minimum number of columns capable of covering or partitioning the row 
set is exactly equivalent to maximizing the primal degeneracy present in
the optimal basic solution. 

"Pivoting" is the name applied to the procedure which accomplishes a 
basis exchange. A degenerate basis exchange is one in which a column 
leaves the basis, a new column enters the basis, and the value of the 
objective function does not change. A degenerate pivot, then, exhibits 
the undesirable property that the basis exchange uses up computation time, 
but no overt improvement in the objective function is realized. (We will
ignore for the moment that we face a serious theoretical dilemma in 
demonstrating that the simplex method is finite in the presence of 
degeneracy.) Every pivot involves the update of the basis inverse 
representation; therefore, each update usually introduces round-off 
error. As discussed earlier, the basis inverse transformation for an 
ill-conditioned basis accumulates round-off error very quickly. In the 
presence of massive degeneracy, then, it is possible for the convergence 
of the primal simplex algorithm to be prohibitively slow, because an 
excessive amount of time is spent making degenerate basis exchanges and 
performing reinversion. 

Degeneracy and round-off error can also produce a very serious 
phenomenon called "cycling." It is possible that a repeating sequence of 
degenerate basic solutions will be generated such that the simplex 
algorithm cycles endlessly without making progress. Most LP systems 
ignore the threat of cycling, because the repeating sequence is usually 
broken after reinversion "randomly" permutes the row order, thus evoking 
a new solution trajectory. If, however, significant time is spent in a
cycle while waiting for reinversion to be triggered by the pivot count,
the internal clock, or by a check on the rounding error, rapid solution
of the LP will not be possible.

It has been determined that degeneracy and consequent cycling are
significant obstacles for the efficient solution of the LP relaxation of
the SPP/SCP with most of the available LP systems. Massive primal
degeneracy is present as a consequence of the binary coefficients and the
fact that for most SPP/SCP's, the right-hand sides of each row are
identical. It is this massive primal degeneracy which led Marsten to
suggest the use of a dual algorithm for the solution of the LP [Ref. 1].
For the unit-cost objective function, a similar dual degeneracy can also
be present. Although in most problems with general costs, the dual
degeneracy is less severe than the primal degeneracy, even objective
functions with varying costs can produce an effective degeneracy due to
round-off error. The problem called TRUCK exhibits these characteristics.

V. REFORMULATIONS


A. THE NEED FOR REFORMULATION 

The LP relaxation of the SPP/SCP can be numerically troublesome. 
One way to avoid this difficulty is to seek another relaxation which may 
be easier to solve. The alternate relaxations examined here are based on 
networks. Reformulation of the SPP/SCP as a network makes it possible to 
exploit an efficient solution technology. Network codes such as GNET 
[Ref. 58], and GENNET [Ref. 59] use basis handling procedures which 
require very little real arithmetic, thus avoiding much of the round-off 
error problem. Reformulation comes at the cost of making the problem 
larger, but it is hoped that the superior speed and numerical stability 
of the network approach will more than make up for the increase in 
problem size. 


B. THE FIRST GENERALIZED NETWORK REFORMULATION 

Glover, Hultz, Klingman, and Stutz [Ref. 60] have offered an interesting 
reformulation of any all-binary integer problem as an integer generalized 
network. The Generalized Network formulation is attractive because of 
the emergence of some very fast computer codes for solving generalized 
networks. Glover, et al., report that their code, NetG, is up to 50 times 
faster than state-of-the-art commercial LP codes on continuous network 
problems. GENNET has proved to be comparable in solution speed for the 
same class of problems. This reformulation, then, (which we will call 
GNIP-1) offers some promise for the solution of the SPP/SCP. It also 
provides a way of describing the SPP/SCP in network terms which can make 
it easier to formulate and explain the model. 


The SPP 

    (1)  \min \sum_{j=1}^{n} C_j X_j 

    (2)  s.t.  \sum_{j=1}^{n} a_{ij} X_j = 1,   i = 1, ..., m 

    (3)  X_j \in \{0,1\},   j = 1, ..., n 

    (4)  C_j \ge 0,   j = 1, ..., n 

    (5)  a_{ij} = 1 if column j covers row i, 0 otherwise 

reformulated as a Generalized Network becomes the GNIP-1 

         \min \sum_{k} C_k X_k 

         s.t.  -\sum_{k: head j} M_k X_k + \sum_{k: tail j} Y_k = 0,   j = 1, ..., n 

    (12) \sum_{k: head i} Y_k = 1,   i = 1, ..., m 

         X_k \in \{0,1\} 

         0 \le Y_k \le 1 

For the SCP, (12) would be replaced by \sum_{k: head i} Y_k \ge 1. 


The procedure for drawing the network flow diagram is given below. 
Given a SPP (SCP) with m rows, n columns, and NCE(j) non-zero elements 
per column, 

1. Create a node i for each constraint, i = 1, ..., m; and give each 
   node a demand of 1 (SCP ≥ b_i). 

2. Create a node j for each variable, j = 1, ..., n; where demand = 
   supply = 0. 

3. Create a super source node S and give it a supply ≥ 0. 

4. Create a generalized arc X_k (S,j) for each original variable, 
   j = 1, ..., n. 

   a. Assign arc X_k a cost of C_j. 
   b. Designate arc X_k as an integer (0-1) arc. 
   c. Give arc X_k a multiplier M_k equal to NCE(j). 

5. Create a pure network arc Y_k (j,i) for each non-zero element in 
   column j. 

   a. Assign arc Y_k a cost of zero. 
   b. Assign arc Y_k an upper bound of one and a lower bound of zero. 
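
The construction steps above can be sketched in a few lines of Python. 
This is only an illustration under stated assumptions: the names Arc and 
build_gnip1, the node labels, and the small 3 x 3 matrix are hypothetical 
and are not taken from any of the network codes cited here. 

```python
from dataclasses import dataclass


@dataclass
class Arc:
    tail: str
    head: str
    cost: float
    multiplier: float
    upper: float
    integer: bool


def build_gnip1(a, costs):
    """Build GNIP-1 arcs from a 0-1 matrix a (m rows, n columns)."""
    m, n = len(a), len(costs)
    # Step 1: one node per constraint, each with a demand of 1.
    demand = {f"row{i}": 1 for i in range(m)}
    arcs = []
    for j in range(n):
        rows = [i for i in range(m) if a[i][j] == 1]
        # Step 4: integer generalized arc from the super source S,
        # with cost C_j and multiplier NCE(j).
        arcs.append(Arc("S", f"col{j}", costs[j], float(len(rows)), 1.0, True))
        # Step 5: one pure arc (cost 0, bounds 0..1) per non-zero element.
        for i in rows:
            arcs.append(Arc(f"col{j}", f"row{i}", 0.0, 1.0, 1.0, False))
    return demand, arcs


# Hypothetical example data.
a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]
demand, arcs = build_gnip1(a, costs=[5.0, 3.0, 4.0])
x_multipliers = [arc.multiplier for arc in arcs if arc.integer]
print(len(arcs), x_multipliers)
```

The arc count is n plus the number of non-zero elements, which is the 
growth in problem size mentioned in Section A. 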


The GNIP-1 reformulation of the Air Freight Example Problem is displayed 
in Figure 1. 

It can be seen in Figure 1 that if the flow on a generalized arc X_k 
is zero, the flows on the continuous arcs Y_k emanating from the variable 
node j are also zero. It is also clear that if the flow on the generalized 
arc is 1, a flow of M_k arrives at the variable node, forcing a flow of 
1 on each continuous arc incident to that node. 

The telling disadvantage of the GN reformulation is the weakness of 
its continuous relaxation. When the integer restriction is removed for 
the arcs X_k, there is no assurance that an integral flow will arrive at 
the variable node. So far, this is comparable to the LP relaxation of 
the integer variable X_j. The difference between the LP relaxation and 
GN relaxation lies in the continuous arcs Y_k emanating from the variable 
node. Given a fractional supply, there is no assurance that the flows on 
each continuous arc will be the same. If the flows were identical, the 
GN relaxation would be the same as the LP relaxation. Unfortunately, 
empirical evidence suggests that the flows differ widely. Furthermore, 
the problem worsens as the number of non-zero elements per column in the 
original ILP increases. Srinivasan and Thompson [Ref. 61] report 
a practical upper bound of three or four non-zeros per column for a 
similar reformulation of set partitioning problems. 

Figure 1. GNIP-1 Reformulation of the Air Freight Example 
(Legend: arc parameters (C_k, M_k); * = integer (0,1); demand = 1; 
variable nodes (j); constraint nodes (i).) 
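
A small piece of arithmetic makes the weakness concrete. The numbers 
below are assumptions chosen only for illustration (a column with three 
non-zeros, a cost of 6, and a fractional flow of 1/3 on its generalized 
arc). Both flow patterns satisfy conservation at the variable node, but 
only the equal split corresponds to the LP relaxation. 

```python
from fractions import Fraction

nce = 3                     # assumed NCE(j) for one column
cost = Fraction(6)          # assumed C_j
x = Fraction(1, 3)          # fractional flow on the generalized arc X_k

arriving = nce * x          # the multiplier M_k = NCE(j) scales the flow

# Equal split: each continuous arc carries X_k, as the LP relaxation would.
equal_split = [x] * nce
# Skewed split: also feasible in the GN relaxation, since each Y_k only
# has to lie between 0 and 1 and the flows only have to sum to the arrival.
skewed_split = [Fraction(1), Fraction(0), Fraction(0)]

for flows in (equal_split, skewed_split):
    assert sum(flows) == arriving
    assert all(0 <= y <= 1 for y in flows)

cost_paid = cost * x        # identical objective contribution in both cases
print(arriving, cost_paid)
```

In the skewed pattern one row is fully covered for one third of the 
column cost, which is why the GN bound can fall below the LP bound. 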

Glover and Mulvey [Ref. 3] state that it is legitimate to manipulate 
the costs incident to a given variable node provided that these costs 
always sum to C_j. This can be interpreted as a form of "Lagrangian" 
manipulation, taking side constraints into the objective function, where 
these side constraints stipulate that the flow on each pure network arc 
incident to each variable node be the same. By linear programming 
duality, there exists some such assignment of costs for which the optimum 
objective function value for the GN is the same as that for the LP 
relaxation. Obviously, the trick is to find an exact or heuristic 
procedure of assigning these costs to "strengthen" the GN relaxation. 
Several attempts were made to distribute costs according to the proportion 
of flows on the first and subsequent GN relaxations, but these attempts 
proved ineffective. 


C. THE SECOND GENERALIZED NETWORK REFORMULATION 

One way to strengthen the GN relaxation is to reduce the number 
of continuous arcs in the reformulation. Glover and Mulvey [Ref. 3] 
have presented another GN formulation for the ILP (GNIP-2) which 
eliminates the super source node, the n generalized arcs emanating from 
it, and one continuous arc per variable node. For the ILP with m rows, 
n columns, and NCE(j) non-zero elements per column, 

1. Create a node i for each constraint, i = 1, ..., m; and give each 
   a supply of one (SCP ≥ b_i). 

2. Create a node j for each variable, j = 1, ..., n; where demand = 
   supply = 0. 

3. Create an arc (i,j) for each non-zero element in the ILP, connecting 
   each constraint node to the appropriate variable node. 

   a. Select one arc for each j and designate it as an integer (0-1) 
      generalized arc X_k. 

      (1) Assign arc X_k a cost of C_j. 
      (2) Assign arc X_k a multiplier of M_k = NCE(j) - 1 
          (if NCE is greater than one). 

   b. Designate the remaining arcs as continuous generalized arcs Y_k. 

      (1) Assign arc Y_k an upper bound of 1. 
      (2) Assign arc Y_k a cost of zero. 
      (3) Assign arc Y_k a multiplier of M_k = -1. 

   c. If NCE(j) = 1, 

      (1) Create a slack node S with a supply ≤ M. 
      (2) Create a continuous arc Y_k (S,j) as in 3b above. 

The GNIP-2 reformulation of the Air Freight Example is displayed in 
Figure 2. 
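
A minimal sketch of the GNIP-2 construction follows. The function name 
build_gnip2, the tuple layout, and the example matrix are hypothetical. 
Two assumptions are made that the text does not state: the first 
non-zero of each column is the one selected as the integer arc, and for 
a singleton column the integer arc is given a multiplier of 1 so that it 
balances the single slack arc. Null columns are assumed to be absent. 

```python
def build_gnip2(a, costs):
    """Return GNIP-2 arcs as (tail, head, kind, cost, multiplier) tuples."""
    m, n = len(a), len(costs)
    arcs = []
    for j in range(n):
        tails = [f"row{i}" for i in range(m) if a[i][j] == 1]
        if len(tails) == 1:
            # Step 3c: a singleton column receives an extra arc from slack node S.
            tails.append("S")
        continuous = len(tails) - 1      # NCE(j) - 1, or 1 for a singleton
        # Step 3a: the first arc is the integer generalized arc X_k (assumed choice).
        arcs.append((tails[0], f"col{j}", "X", costs[j], float(continuous)))
        # Step 3b: the remaining arcs are continuous, cost 0, multiplier -1.
        for tail in tails[1:]:
            arcs.append((tail, f"col{j}", "Y", 0.0, -1.0))
    return arcs


# Hypothetical example data.
a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]
gnip2_arcs = build_gnip2(a, costs=[5.0, 3.0, 4.0])

# With unit flow on every arc into a variable node, the multipliers cancel.
balance = [sum(arc[4] for arc in gnip2_arcs if arc[1] == f"col{j}")
           for j in range(3)]
print(len(gnip2_arcs), balance)
```

Compared with the first reformulation, the super source and its n arcs 
are gone, so the arc count is the number of non-zero elements plus one 
slack arc per singleton column. 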


The above procedure produces the following mathematical program GNIP-2: 

         \min \sum_{k} C_k X_k 

         s.t.  \sum_{k: head j} M_k X_k + \sum_{k: head j} M_k Y_k = 0,   j = 1, ..., n 

    (13) \sum_{k: tail i} X_k + \sum_{k: tail i} Y_k = 1,   i = 1, ..., m 

         X_k \in \{0,1\} 

         0 \le Y_k \le 1 

For the SCP, (13) would be replaced by 
\sum_{k: tail i} X_k + \sum_{k: tail i} Y_k \ge 1. 

Figure 2. GNIP-2 Reformulation of the Air Freight Example 
(Legend: arc parameters (C_k, M_k); * = integer.) 

Table 5 shows the relative strengths of the GN and LP relaxations, 
plus relevant problem dimensions for a few of the test problems. The 
column labeled TIME OF LP/GN displays the time required to obtain the 
first solution of the LP/GN relaxations, given in IBM 3033 CPU seconds. 
A special version of GENNET [Ref. 59] was used to obtain the GN relaxations, 
and the current version of the X System [Ref. 62] was used for the LP 
relaxations. The GNIP-1 relaxations for a few problems were obtained 
with the X System so comparative times will not be given. %OPT is the 
value of the continuous relaxation multiplied by 100 and then divided by 
the value of the optimal integer solution. The closer %OPT is to 100, 
the better the relaxation. 

STEINER1A and STEINER2A are two SCP's created by transposing the 
coefficient matrices of the STEINER1 and STEINER2 problems. These problems 
were constructed because no real problems were available having fewer 
than four non-zero elements per column. The GN relaxations have been 
reported to be non-competitive with the LP relaxations when the number of 
non-zeros per column exceeds three or four [Ref. 61]. The results from 
this study indicate that the strength of the GN relaxation is unpredictable. 
The GN relaxations for the symmetrical STEINER problems are very 
competitive with the LP relaxations. The GN relaxation for BUS, however, was 
unexpectedly bad. The gap for BUS is so great that there is little hope 
of a reasonable solution trajectory in the enumeration phase. Even the 
relatively easy problems, TIGER1 and TIGER2, produced GN relaxations of 
poor quality. 

TABLE 5. GENERALIZED NETWORK RELAXATIONS FOR SELECTED PROBLEMS 


                       ROWS/    COLS/    AVG    TIME OF 
PROBLEM     MODEL      NODES    ARCS     NCE    LP/GN     %OPT 


| | | | | | | 

eC el O7ee lee ieesa0) laeGsees \ulnoeON 

STEINERIA | Gurp-o ] 144 | 351 | 1 0.09 | 100.0 | 
| | | | | | | 

| SCP | 4 | 330 | | | ; 

5 0 3.0 1.14 | 100.0 | 

STEINERZA | Gurp-2 | 375 | 990 | i O01 oo. 1 
| : | | | | | 

| | | | | 

a> a eeeliizale 27) ceo OMecee eisORON | 

STEINERL =) Gnrp-2 | 144 | 351 | Mee | aoa I 
| | | | | | | 

ee a 

icon eccOuMee 25) 22K) imeORScileSCmamn | 

STEINER2 =) Gurp-2 | 375 | 990 | | ae | aot 
| | | | | | | 

| | | | | | | 

mescpemeles. 56) San 1 8.18 | 82.8 | 

BUS | GNIP-1 | 587 | 3894 | 6.3 | -- | 0.0014 | 
auip-2 9) 587 | 3372 | 1 1.18 | 0.0023 | 

: | | | | | | 

| spp | 160 | 636 | 1 0.94 | 100.0 | 

TIGER ene I ey Ie se Tegel 55 ey ee 
aces | 797 | aco | Ve le eet 

: | | | | | : 

ia oe | | 

— 1 spp | 107 | 2188 | 4.3 | 9.46 | 100.0 | 
| GNIP-2 | 2296 | 8292 | 2.0 | 3.90 | 46.3. | 

| | | | 


D. THE GENERALIZED PROCESS NETWORK REFORMULATION 

Some recent work by Koene [Ref. 63] on Processing Networks offers an 
interesting perspective for improving the GNIP formulation. A processing 
network is one for which the flows on arcs going out from (or into) a 
given node are proportional to each other. To achieve this proportionality 
(in our case, we desire equal flows), a network with side constraints 
must be solved. Several authors have reported some success with solving 
pure and generalized networks with a "few" complicating constraints 
[Refs. 64, 65, 66]. Unfortunately, the number of side constraints in 
this case grows again with the number of non-zero elements in each column 
of the ILP. In fact, so many side constraints are needed in these 
problems that the size of the basis inverse representation which must be 
carried along with the network becomes prohibitive. 

Because the side constraint portion is so large, it was decided to 
view the generalized process network problem as a candidate for either 
generalized upper bounding factorization or network factorization routines 
embedded within an LP system. Generalized Upper Bounding (GUB) refers to 
a set of rows with at most one non-zero in each column. The coefficients 
of each non-zero must be +1 or -1 (or capable of being scaled to +1 or -1). 
Network factorization refers to a set of rows with at most two non-zeros 
in each column, and the non-zeros may be of any value or sign. Brown and 
Wright [Ref. 67] have examined techniques for extracting network structures 
embedded in general LP problems, but the test bed for network factorization 
routines coupled with an Integer Programming System is not yet in place, 
so no results can be reported at this time. A Generalized Upper Bounding 
factorization routine was available, however, and a formulation of the 
GNIP-2 process network scaled for GUB appears below. 


(14) MIN CX, 
k 
ies Sel. X, + SY, = 5, 7 Soy woe. We OR 
k:head j ‘ en j ‘ 
(16) SS epee MY, = 0, i = 12°. cone 


VED, 
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(17) YE - re =e 1 basil aes ee enlame ik: 
KeGali 1 Kessel lied 


an 


where My head j = /Mkshead j - 


(14) is the same as for the GNIP-2. (15) forms a GUB set, the right-hand 
side of (16) would be replaced by ≥ b_i for the SCP, and (17) is the 
side constraint section. 

The number of GUB rows attainable is equal to the number of variable 
nodes in the GNIP. This is an enormous GUB set, but even with the GUB 
rows not considered, the problem is still larger than the original ILP. 
It was predicted that the basis inverse representation obtained from this 
factorization would not be as ill-conditioned as the representation 
obtained from the normal LP bases. However, the LP's with GUB 
factorization are also very difficult to solve and the results do not 
indicate that this promises to be a competitive approach. 

VI. COMPUTATIONAL EXPERIENCE 


A. THE COMPUTER CODE 

The computer code used for this research is a large-scale optimization 
test bed called the X System or simply XS. XS has been developed since 
1974 [Ref. 62] as a general-purpose optimization system of advanced 
design which serves both as a prototype test bed for research and as the 
fundamental computational foundation of many application packages utilizing 
optimization. XS is designed to solve large-scale optimization problems, 
with special emphasis on mixed integer models. The embedded linear 
programming module has received most of the design effort and exhibits 
many singular features including: 

1. Hyper-sparse data representation [Ref. 68]; 
2. Complete constructive degeneracy resolution [Ref. 57]; 
3. Basis factorization [Ref. 69]; and 
4. Elastic range constraints [Ref. 62]. 

XS consists completely of open FORTRAN subroutines. FORTRAN IV H 
(Extended) OPTIMIZE(2) is the implementation dialect and an IBM 3033 is 
the host computer. All of the problems with the exception of AMERICAN 
were solved interactively under the IBM CMS timesharing system. 

1. Elastic Model 

XS requires that the model be thought of in an extended or 
"elastic" sense. The term elastic comes from the view that no constraint 
is totally binding, but may be violated at a price, an elastic penalty. 
The feasible region is thereby "stretched" to the degree of elasticity 
specified by the penalty structure. The extended elastic model appears 
below. 
    Minimize  \sum_{j=1}^{n} C_j X_j + \sum_{i=1}^{m} (P_i^- S_i^- + P_i^+ S_i^+) 

    s.t.  R_i^- - S_i^- \le \sum_{j=1}^{n} \delta_{ij} X_j \le R_i^+ + S_i^+,   i = 1, ..., m_G 

          R_i^- - S_i^- \le \sum_{j=1}^{n} a_{ij} X_j \le R_i^+ + S_i^+,   i = m_G + 1, ..., m 

          L_j \le X_j \le U_j,   j = 1, ..., n 

          S_i^- \ge 0,  S_i^+ \ge 0,   i = 1, ..., m 

where  C_j               : Cost Coefficients, 
       a_{ij}            : Constraint Coefficients, 
       P_i^- and P_i^+   : Lower and Upper Constraint Violation Penalties, 
       R_i^- and R_i^+   : Lower and Upper Constraint Range Limits, 
       S_i^- and S_i^+   : Logical "artificial" and "surplus" variables, 
       X_j               : Variables (any of which may be integer), 
       L_j and U_j       : Lower and Upper Variable Bounds, 
       \delta_{ij}       : 0 or +1 or -1 (GUB indicators), 
       m_G               : Number of GUB rows, 
       m                 : Row Dimension, 
       n                 : Column Dimension. 
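
The elastic view described above, in which a row may be violated at a 
price, can be illustrated by evaluating the penalized objective for a 
given solution. This is a sketch only: the function name 
elastic_objective, its argument layout, and the example data are 
hypothetical and do not describe the XS implementation. 

```python
def elastic_objective(c, a, x, r_lo, r_hi, p_lo, p_hi):
    """Cost of x plus penalties for rows outside their range [r_lo, r_hi]."""
    value = sum(cj * xj for cj, xj in zip(c, x))
    for i, row in enumerate(a):
        activity = sum(aij * xj for aij, xj in zip(row, x))
        shortfall = max(0.0, r_lo[i] - activity)   # amount below the lower range
        excess = max(0.0, activity - r_hi[i])      # amount above the upper range
        value += p_lo[i] * shortfall + p_hi[i] * excess
    return value


# Hypothetical set partitioning data: every row has range [1, 1].
c = [5.0, 3.0, 4.0]
a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]
ones = [1.0, 1.0, 1.0]
p_lo, p_hi = [100.0] * 3, [50.0] * 3

overcovered = elastic_objective(c, a, [1, 1, 0], ones, ones, p_lo, p_hi)
partition = elastic_objective(c, a, [1, 0, 1], ones, ones, p_lo, p_hi)
print(overcovered, partition)
```

Every 0-1 vector receives a finite value, so any rounded solution is 
admissible in the elastic sense even when it is not classically feasible. 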


2. Hyper-sparse Data Representation 

Appendix B exhibits a specimen of the data input format used for 
this research. This particular form exploits the hyper-sparse data 
structure capability of XS since there are very few unique real numbers 
in a SPP/SCP. All of the constraint coefficients are 1 and 0; therefore, 
it is not necessary to store a real number for each non-zero coefficient, 
but merely its address. Further efficiencies can be realized in a similar 
manner for the unit-cost objective function, and for the right-hand side 
for which b_i = 1 for all i. 
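
The idea of storing only the address of each non-zero can be sketched 
with a simple column-wise index structure. The layout below is an 
assumption made for illustration (a conventional compressed-column 
scheme); it is not claimed to be the structure used inside XS, and the 
names pack_columns and column_rows are hypothetical. 

```python
def pack_columns(a):
    """Store, for each column, only the row addresses of its 1's."""
    m, n = len(a), len(a[0])
    col_start, row_index = [0], []
    for j in range(n):
        for i in range(m):
            if a[i][j] == 1:
                row_index.append(i)      # an address, not a real coefficient
        col_start.append(len(row_index))
    return col_start, row_index


def column_rows(col_start, row_index, j):
    """Rows covered by column j."""
    return row_index[col_start[j]:col_start[j + 1]]


# Hypothetical example data.
a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]
col_start, row_index = pack_columns(a)
print(col_start, row_index, column_rows(col_start, row_index, 1))
```

No floating point value is stored for the matrix at all, since every 
stored entry is known to be 1. 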

3. Primal Dual Algorithm and Degeneracy Resolution 

The representation of the Basis Inverse maintained by XS admits 
the application of the Primal or Dual Algorithm with equal facility. 
This fact makes it possible to use the Dual Algorithm for this problem 
class with no loss of performance with respect to the Primal. As mentioned 
earlier, the Dual is the algorithm of choice because of the massive 
primal degeneracy present in this class of problems. 

It is this equal facility between the Primal and the Dual which 
provides the framework for the degeneracy resolution machinery in XS. In 
Graves' terminology [Ref. 57], when degeneracy is encountered, the 
algorithm is said to be "blocked." The resolution of blocking in either 
the primal or dual algorithm is accomplished by shifting to the alternate 
algorithm when blocking occurs. The alternate algorithm is applied to a 
subproblem of the original problem and at worst we are led to a contracting 
sequence of problems to which we alternately apply the primal and dual 
algorithms. A strict contraction can be assured, and thus in at most a 
finite number of steps, resolution is assured. A complete illustration 
of blocking resolution can be found in [Ref. 70]. 

The degree to which blocking is resolved through this sequential 
nesting of subproblems is controlled by a blocking resolution parameter. 
This parameter can be set so that any degree between no resolution and 
total resolution can be obtained. The parameter also controls the point 
at which blocking resolution begins in the solution trajectory. This 
means that blocking resolution can be inhibited early in the solution 
trajectory when it may not be efficient to resolve every degeneracy, and 
then enabled as the trajectory nears optimality. 


B. HEURISTICS WITHIN THE EXACT ALGORITHM 

The efficiency of branch and bound algorithms can be improved through 
judicious use of heuristic information that indicates good branching 
strategies to follow. If good feasible solutions (incumbents) can be 
obtained early, then fathoming can occur more quickly and more frequently 
in the search. Also, premature termination of the algorithm will more 
often result in near-optimal or optimal solutions. 

1. Elastic Heuristic 

A robust technique for obtaining an incumbent has been incorporated 
into the XS enumeration system. Any continuous relaxation of the Elastic 
ILP can be rounded to an integer solution with very little computational 
effort. Further, all such rounded solutions are admissible (feasible in 
the extended elastic sense). The current continuous solution is rounded 
in three passes, each of which selects variables from a class defined in 
terms of θ, where 0 ≤ θ ≤ .5 and (1 - θ) ≤ X_j ≤ 1 or 0 ≤ X_j ≤ θ. 

Class 1: Nearly integral (0 ≤ θ ≤ .2). 
Class 2: Fractional (.2 < θ ≤ .4). 
Class 3: Ambivalent (.4 < θ ≤ .5). 

The rounding heuristic sequentially exhausts variables from each class 
and rounds using a "minimal regret function," rounding away from the 
worst penalty. 

In the default elastic enumeration scheme, there are only depth 
and value-motivated fathoming rules. Feasibility plays no direct role 
in the enumeration except via the accumulated cost of the penalties in 
the objective function. Integer solutions and lower bounds of excellent 
quality are empirically produced quite early in the enumeration effort, 
permitting routine early termination of the search based on an optimality 
tolerance or on a maximum depth (permissible number of fixed variables in 
any restriction). Tuning of the method is easily accomplished via these 
two limits and the elastic penalties used to express the underlying model. 

The penalty structure found most effective with this heuristic is 
based upon the number of non-zero row elements in each row of the constraint 
matrix, called NRE(i). It has also been determined that the upper penalty 
P_i^+ should be set at one-half the value of the lower penalty P_i^-, in 
order to coerce the heuristic to round up. This forces shallow 
termination with respect to depth fathoming. A penalty constant, P, is set at 
approximately one order of magnitude greater than the largest cost 
coefficient, so that for each row in the SPP/SCP 

    Set P_i^- = P/NRE(i). 
    Set P_i^+ = 1/2 x P_i^-. 

Penalties set in this manner "communicate" to the heuristic 
that rows which can be covered in only a few ways are more important than 
rows with a higher row count. In the enumeration, then, these "important" 
rows are satisfied first, making it possible to avoid many of the alternate 
possibilities available for covering rows with a large row count. 
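
The penalty rule can be sketched as follows. The function name 
elastic_penalties and the example data are hypothetical, and the factor 
of 10 is an assumed reading of "approximately one order of magnitude". 
Rows are assumed to be non-null so that NRE(i) is never zero. 

```python
def elastic_penalties(a, costs, scale=10.0):
    """Lower and upper row penalties based on the row counts NRE(i)."""
    big_p = scale * max(costs)           # penalty constant P (assumed factor)
    lower, upper = [], []
    for row in a:
        nre = sum(row)                   # NRE(i): non-zero elements in row i
        p_lo = big_p / nre
        lower.append(p_lo)
        upper.append(0.5 * p_lo)         # upper penalty is half the lower
    return lower, upper


# Hypothetical example data.
a = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]
lower, upper = elastic_penalties(a, costs=[5.0, 3.0, 4.0])
print(lower, upper)
```

The first row, which can be covered in only one way, receives the 
largest penalties and is therefore the most expensive row to leave 
uncovered. 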

Table 6 exhibits the computational characteristics of the Elastic 
Method. LP and LPtime indicate the solution value and solution time (in 
CPU seconds) for the first continuous relaxation of the indicated problems. 
OPT is the optimal integer solution, and ILPtime is the total CPU time in 
seconds required to solve the ILP to the optimum. INPUT/OUTPUT time is 
included in the ILPtime values. Input time includes the time to read in 
the entire problem and accomplish error checking. Output time includes 
the printing of intermediate information plus the time required to 
execute the report writer. The dual algorithm was used for all solutions. 


TABLE 6. COMPUTATIONAL RESULTS FOR THE ELASTIC METHOD 


              LP           LPtime        OPT        ILPtime 

STEINER1      9.0          0.08          18         25.13 
MCOVER1       0.0 (9)      0.05          0 (18)     [illegible] 
STEINER2      15.0         [illegible]   30         527.20 
MCOVER2       0.0 (15)     [illegible]   0 (30)     417.08 
TIGER1        56406.0      0.94          56406      0.97 
TIGER2        15098.0      9.46          15098      [illegible] 
TIGER2a       15684.0      [illegible]   15684      10.30 
BUS           50754.5      8.18          61308      177.44 
AMERICAN      No LP solution after 30 minutes CPU time 
TRUCK         No LP solution after 30 minutes CPU time 
TANKER        75941.4      34.05         75941.4    44.62 


It is interesting to note that the maximal set covering 
reformulations of the two STEINER problems as MCOVER1 and MCOVER2 produced easier 
ILP's. The budget constraints for these problems were constructed so 
that the maximal covering problems would seek the same optima as the SCP 
formulations. TIGER2a is a variant of TIGER2 obtained by eliminating 
some of the columns found in the optimal solution of TIGER2. Marsten has 
indicated in [Ref. 23] that problems in this general class typically have 
many nearly optimal integer solutions. TIGER2a supports this observation: 
8 of the 33 optimal columns for TIGER2 were removed to construct TIGER2a, 
and the solution is only 3.9 percent worse. 


C. THE ELASTIC METHOD WITH STARTING SOLUTIONS 

As indicated in Table 6, the LP relaxations for AMERICAN and TRUCK 
were not solved within 30 minutes. After many unsuccessful attempts to 
overcome the numerical instabilities peculiar to these LP's, various 
combinatorial and heuristic methods for obtaining starting solutions were 
considered. Given a feasible, suboptimal solution of "reasonable" quality, 
it should be possible for the elastic method to find the optimal solution 
in few enough iterations to avoid many of the numerical difficulties. 


D. LOGICAL REDUCTION 

Garfinkel and Nemhauser [Ref. 71] have given a set of simple rules 
for logically reducing a problem matrix for the SPP/SCP with b_i = 1 for 
all i. Although logical reduction is not guaranteed to provide a starting 
solution, substantial reductions in problem size can greatly improve the 
numerical behavior of many problems, especially if the problems were 
originally generated with many inherent redundancies. This explanation 
of logical reduction provides insight into the structure of the SPP/SCP 
and is valuable for understanding and evaluating some of the heuristics 
used for constructing starting solutions. 

Not all of the rules were chosen for implementation because it is 
felt that their inclusion does not return sufficient reduction to justify 
the computational expense of using them. The implementation scheme used 
is a non-backtracking (polynomial time) routine involving easy binary 
comparisons of rows and columns. If significant reduction is achieved 
after one application, the scheme is applied iteratively until no more 
improvement is obtained. Those rules which were implemented are based on 
the notions of row and column dominance. For example, if two columns are 
equal element-by-element (ColA = ColB), and one has a cheaper cost, then 
the more expensive column is dominated by the cheaper and may be deleted. 
Not so obviously, if RowA is wholly contained as a subset of RowB 
(RowA ⊆ RowB), then RowB may be deleted, since any column which covers 
RowA will also cover RowB. The four rules used are: 
Rule 1: Delete all null columns and null rows. 

Rule 2: Column Dominance. 
   A. SCP: If ColA ⊇ ColB, and CostA ≤ CostB, delete ColB. 
   B. SPP: If ColA = ColB, delete the more expensive column. 

Rule 3: Row Singleton. 
   A. Delete the row covered by only one column. 
   B. Fix the variable associated with the singleton to one. 
   C. Delete all rows covered by the fixed variable. 
   D. SPP: Delete all columns in the rows deleted by 3C. 

Rule 4: Row Dominance. 
   A. If RowA ⊇ RowB, delete RowA. 
   B. SPP: If RowA is deleted, also delete every column in RowA 
      which is not included in RowB. 

The column and row reduction schemes can be applied iteratively, since 
after the first application, additional dominance may be discovered. 

Consider the constraint matrix of the Air Freight Example below. 

                  R1  R2  R3  R4  R5  R6  R7 

Los Angeles        1   0   0   0   0   0   0 
San Francisco      1   1   1   0   0   0   0 
San Jose           1   1   1   0   0   0   0 
Denver             0   0   1   1   1   0   0 
Portland           0   0   0   1   1   0   0 
Seattle            0   0   0   1   1   1   0 
San Diego          0   0   0   0   1   1   1 

Costs              0   0   0   0   6   7   4 

Application of the reduction rules would achieve the following results: 

For the SPP, 

Rule 3: A. Delete Los Angeles. 
        B. Fix R1 to one and delete it. 
        C. Delete San Francisco and San Jose. 
        D. Delete R2 and R3. 

Rule 4: A. Denver ⊇ Portland, delete Denver. 
           Seattle ⊇ Portland, delete Seattle. 
        B. Delete R6. 

The resulting reduced problem is 

                  R4  R5  R7 

Portland           1   1   0 
San Diego          0   1   1          Solution = R1, R4, R7. 

Costs              0   6   4 

The action of deleting a row or column does not necessarily mean 
that the row or column has been eliminated from the problem. In the 
reduced air freight example, the removal of Denver and Seattle from 
consideration means simply that any column which covers Portland will 
automatically cover Denver and Seattle; therefore, Portland is the 
critical row. The same observation holds for San Diego. Table 7 exhibits 
the degree of reduction achieved and the computation times for two of the 
test problems. % RED is derived by dividing the number of rows/columns 
deleted by the number of original rows/columns, respectively. No reduction 
was achieved for STEINER1, STEINER2, BUS, TIGER2, TANKER, and TRUCK. 
TIME indicates the total time in CPU seconds required to achieve the 
indicated reduction. 


TABLE 7. LOGICAL REDUCTION RESULTS FOR SELECTED PROBLEMS 


            ROWS          % RED         COLS          % RED         ITERATIONS    TIME 

TIGER1      160           [illegible]   636           13.8          [illegible]   [illegible] 
AMERICAN    [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   1743.0 


The tremendous column reduction achieved on AMERICAN is due to the 
absence of crew base constraints. During the original generation of this 
problem, entire sets of columns were replicated and were designed to be 
kept distinct by the mutually exclusive crew base constraints. 
Unfortunately, the original crew base constraints are no longer available. Once 
the reduction for AMERICAN is explained, then, the benefits obtainable 
by logical reduction do not appear to justify its computational expense. 
The reduction routine could be a valuable aid, however, in validating 
the performance of column generation programs, and could also be of use 
in the first screening of problem data. 
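
The row singleton and row dominance rules for the SPP (Rules 3 and 4) 
can be sketched as a short routine. This is an illustration only, not 
the routine used in this research: the name reduce_spp and the 
dictionary layout are hypothetical, Rules 1 and 2 are omitted, and the 
input is assumed to be a feasible problem with no null rows. The data 
reproduce the row and column incidence of the Air Freight Example. 

```python
def reduce_spp(rows):
    """rows maps each row name to the set of columns covering it.
    Returns (columns fixed to one, reduced rows)."""
    rows = {r: set(cols) for r, cols in rows.items()}
    fixed = []
    changed = True
    while changed:
        changed = False
        # Rule 3: a row covered by one column fixes that column to one.
        for r, cols in list(rows.items()):
            if len(cols) == 1:
                (j,) = cols
                fixed.append(j)
                covered = [q for q in rows if j in rows[q]]
                dead = set().union(*(rows[q] for q in covered))
                for q in covered:            # 3A and 3C: delete covered rows
                    del rows[q]
                for q in rows:               # 3D: delete columns in those rows
                    rows[q] -= dead
                changed = True
                break
        if changed:
            continue
        # Rule 4: if RowA contains RowB, delete RowA and its extra columns.
        names = list(rows)
        for ra in names:
            for rb in names:
                if ra != rb and rows[ra] >= rows[rb]:
                    extra = rows[ra] - rows[rb]
                    del rows[ra]
                    for q in rows:
                        rows[q] -= extra
                    changed = True
                    break
            if changed:
                break
    return fixed, rows


air_freight = {
    "Los Angeles": {"R1"},
    "San Francisco": {"R1", "R2", "R3"},
    "San Jose": {"R1", "R2", "R3"},
    "Denver": {"R3", "R4", "R5"},
    "Portland": {"R4", "R5"},
    "Seattle": {"R4", "R5", "R6"},
    "San Diego": {"R5", "R6", "R7"},
}
fixed, reduced = reduce_spp(air_freight)
print(fixed, reduced)
```

The routine restarts after every deletion, which mirrors the remark 
that the schemes can be applied iteratively until no further dominance 
is found. 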


E. A GREEDY HEURISTIC ALGORITHM 
Baker [Ref. 72] describes a heuristic algorithm developed to exploit 
the structure of large airline crew scheduling problems formulated as SCP's. 
The approach is to successively augment the solution set for the SCP by 
selecting columns which exhibit the minimum average cost per uncovered row. 
STEP 1: Initialization. Solution set = ∅. Row Coverage Set = ∅. 

STEP 2: Selection. Choose the column X* that has the minimum 
        average cost per uncovered row. 

STEP 3: Update. Add X* to the solution set. Update the row 
        coverage set to reflect the rows covered by X*. If all 
        rows are covered, STOP. Otherwise GO TO STEP 2. 
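
The three steps can be sketched directly. The function name 
greedy_cover and the data layout are hypothetical, ties are broken by 
column order (an assumption, since the text does not specify a rule), 
and the example data are patterned on the Air Freight Example with cost 
values that should be read as illustrative. 

```python
def greedy_cover(columns, costs):
    """columns maps a column name to the set of rows it covers."""
    all_rows = set().union(*columns.values())
    covered, solution = set(), []            # STEP 1: both sets start empty
    while covered != all_rows:
        def cost_per_new_row(j):
            new_rows = len(columns[j] - covered)
            return costs[j] / new_rows if new_rows else float("inf")
        best = min(columns, key=cost_per_new_row)   # STEP 2: selection
        solution.append(best)                       # STEP 3: update
        covered |= columns[best]
    return solution


columns = {
    "R1": {"Los Angeles", "San Francisco", "San Jose"},
    "R2": {"San Francisco", "San Jose"},
    "R3": {"San Francisco", "San Jose", "Denver"},
    "R4": {"Denver", "Portland", "Seattle"},
    "R5": {"Denver", "Portland", "Seattle", "San Diego"},
    "R6": {"Seattle", "San Diego"},
    "R7": {"San Diego"},
}
costs = {"R1": 0, "R2": 0, "R3": 0, "R4": 0, "R5": 6, "R6": 7, "R7": 4}

solution = greedy_cover(columns, costs)
total_cost = sum(costs[j] for j in solution)
times_covered = {r: sum(r in columns[j] for j in solution)
                 for r in set().union(*columns.values())}
print(solution, total_cost)
```

On these data the cover found includes columns that overlap, so several 
rows are covered more than once. The result is a valid set cover but 
not a partition. 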


The worst case bound for the solution obtained from this procedure is 
reported by Baker to be: 

    SOLN(Heuristic) \le SOLN(Opt) \sum_{k=1}^{d} 1/k , 

where d is the maximum number of non-zero elements in any column in the 
solution set. This means that for a set of columns with from four to 
ten non-zero elements, the worst solution obtainable from the heuristic 
is from two to three times larger than the value of the optimum. Table 8 
indicates, however, that the actual performance of the heuristic can be 
much better than the worst case bound. The column labeled START TIME 
indicates the time required to obtain the starting solution. %OPT is 
the percentage difference between the starting solution and the optimal 
solution. This simple heuristic will provide a classically feasible 
solution of reasonable quality for the SCP. The solution for SPP 
will not be feasible, since the heuristic will treat the SPP as a SCP and 
overcovering of the rows will result. 
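
The "two to three times" figure quoted for columns with four to ten 
non-zero elements follows from evaluating the partial harmonic sum in 
the bound. The short check below uses exact fractions; the function 
name worst_case_ratio is hypothetical. 

```python
from fractions import Fraction


def worst_case_ratio(d):
    """Sum of 1/k for k = 1, ..., d."""
    return sum(Fraction(1, k) for k in range(1, d + 1))


ratio_4 = worst_case_ratio(4)     # columns with four non-zero elements
ratio_10 = worst_case_ratio(10)   # columns with ten non-zero elements
print(float(ratio_4), float(ratio_10))
```

The sum grows only logarithmically with the column length, so the 
guarantee weakens slowly as columns become denser. 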


TABLE 8. STARTING SOLUTIONS OBTAINED FROM THE BAKER SCP HEURISTIC 


              START 
              TIME          % OPT 

STEINER1      [illegible]   [illegible] 
STEINER2      [illegible]   [illegible] 
TRUCK         8.47          9.19 

Because these starting solutions are of limited value for set 
partitioning problems, this line of study was terminated in favor of a new 
solution technique developed by Brown and Graves [Ref. 73]. The new 
technique uses a block partitioning scheme to exploit the intrinsic 
structure of the SPP/SCP. 


F. BLOCK PARTITIONING ALGORITHM 

Christofides [Ref. 3] describes a block partitioning structure 
attributed to Pierce [Ref. 15] which has been used by many researchers for 
this problem class. To place the SPP/SCP in block form, we make up m 
blocks of columns, one block for each row. Block i will comprise of 
exactly those columns which cover row i, but do not cover rows 1 to i-l. 
This produces a staircase matrix with zeros to the right of the staircase. 
The blocks in general can be arranged in tableau form as shown in Fiqure 
3, although one or more blocks may be nonexistent tn a particular problem. 

          Block 1   Block 2   Block 3   Block 4   ...   Block m

Row 1     1 ... 1      0         0         0      ...      0
Row 2     0 or 1    1 ... 1      0         0      ...      0
Row 3     0 or 1    0 or 1    1 ... 1      0      ...      0
Row 4     0 or 1    0 or 1    0 or 1    1 ... 1   ...      0
  .          .         .         .         .       .       .
Row m     0 or 1    0 or 1    0 or 1    0 or 1    ...   1 ... 1


Figure 3. Block Partition Structure


Marsten [Ref. 1] determined experimentally that sorting the rows by
increasing length gave consistently good results for his algorithm which
favors the shorter rows for early branching. The row with the fewest
1's is placed at the top, and the row with the most 1's ends up at
the bottom. (This ordering by row length is also depicted in Figure 3.)
Intuitively, it seems reasonable that a row which can be covered in only
a few ways is more critical than a row which can be covered in many ways,
and should therefore be dealt with first. This row ordering scheme was
chosen for implementation.
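The row ordering by increasing length can be illustrated as follows. The
sketch assumes, purely for the example, that each column is held as a set
of 0-based row indices; the function names are hypothetical. Rows are
ranked by the number of columns covering them, and the columns are then
relabeled so that the shortest row becomes row 0.

```python
def rows_by_increasing_length(columns, m):
    """Return the row indices sorted by the number of 1's in each row."""
    length = [0] * m
    for rows in columns:
        for i in rows:
            length[i] += 1
    # sorted() is stable, so rows of equal length keep their original order
    return sorted(range(m), key=lambda i: length[i])


def relabel_rows(columns, order):
    """Rewrite each column in terms of the new row positions."""
    rank = {old: new for new, old in enumerate(order)}
    return [{rank[i] for i in rows} for rows in columns]


if __name__ == "__main__":
    cols = [{0, 1}, {0, 2}, {0}]
    order = rows_by_increasing_length(cols, 3)
    print(order, relabel_rows(cols, order))
```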

Once the problem has been placed into the block structure, three ways
of ordering columns within blocks are found in the literature: (1)
heuristically by increasing or decreasing cost [Ref. 3]; (2)
lexicographically [Ref. 1]; and (3) randomly (i.e., columns are not
explicitly reordered once blocking has been accomplished). The algorithm
developed by Brown and Graves does not presently require that columns be
specially ordered within blocks.

The Block Partitioning Algorithm can be applied to both the initial
LP Relaxation and subsequently to the integer enumeration. For the LP,
the problem is first divided into an arbitrary number of block groups
forming a set of distinct LP subproblems. The first LP subproblem (LP_1)
is solved to optimality, the next LP subproblem (LP_2) is dynamically
appended to the first, and then the two subproblems are solved as one.
The third LP subproblem is appended to the solution of LP_{1,2} and the
procedure continues until all subproblems have been appended and a global
solution has been obtained.
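The control flow of this appending procedure can be outlined as below.
This is a rough sketch under stated assumptions, not the Brown and Graves
implementation: the names group_blocks and solve_by_appending are
hypothetical, the blocks are split into contiguous groups of nearly equal
size (the text only says the number of groups is arbitrary), and the LP
solver is supplied by the caller as a callback that receives the active
columns and the previous solution as a starting point. Which rows are
active in each subproblem is not specified in the text and is left to
that callback.

```python
def group_blocks(blocks, n_groups):
    """Split the ordered blocks into contiguous groups of columns."""
    size, extra = divmod(len(blocks), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        stop = start + size + (1 if g < extra else 0)
        groups.append([j for block in blocks[start:stop] for j in block])
        start = stop
    return groups


def solve_by_appending(groups, solve_lp):
    """Solve LP_1, append LP_2 and re-solve, and so on.

    solve_lp(active_columns, previous_solution) is a caller-supplied
    (hypothetical) routine returning the solution of the LP restricted
    to active_columns.
    """
    active, solution = [], None
    for cols in groups:
        active = active + cols               # append the next subproblem
        solution = solve_lp(active, solution)
    return solution
```

With four groups, as used for the results reported here, solve_lp would
be called four times, the last call covering the complete problem.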

Many variations of this procedure are evident. TRUCK has been solved 
by dual relaxation of the aggregation of successive LP solutions. 
Particular problems exhibit great sensitivity to tuning of this procedure. 
In particular, a few complicating columns are frequently the principal 
cause of computational difficulty. 

Table 9 presents results for the Block Partitioning Algorithm applied
to the LP only. Subsequent to the solution of the global LP, the elastic
enumeration scheme was used to obtain optimality. OPT TIME indicates the
total time required to achieve optimality. I/O time is included in the
values. All times are in IBM 3033 CPU seconds.


TABLE 9. RESULTS FOR THE BLOCK PARTITIONING ALGORITHM 


            NUMBER OF   BLOCK   NUMBER OF LP   GLOBAL     OPT
            BLOCKS      TIME    SUBPROBLEMS    LP TIME    TIME
BUS            48       0.001        4          10.76     174.50
EGE RZ        106       0.04         4          10.98     eye cis
TRUCK         194       O15          + 20) oS lated Si Oia
TANKER         50       eo1l         “ oS AEE Ne
AMERICAN       ps       0.24         4          Mee 2     536.41


*Primal Feasible, Suboptimal solutions. 
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The analogous procedure whereby each subproblem is solved to an 
integer optimum did not compare favorably with the default elastic 
method. This line of study was therefore terminated and another concept 


called the "Refinement Procedure" was investigated. 


G. COLUMN GENERATION AND PROBLEM REFINEMENT 

There is no substitute for possessing the column generating program 
when attempting to solve the large-scale SPP/SCP. Attempting to solve 
large, static problems in a vacuum is doomed to be either expensive or 
impossible. The column generator and the optimization system work best on 
these problems when they are intimately coupled so that each module can 
communicate with the other. In this way, the optimizer works on smaller 
problems and the column generator produces only those columns which can 
contribute to a better solution. 

Graves [Ref. 73] has developed a refinement procedure for the SPP
which attempts to capture, for a static problem, some of the capabilities
which are present when the column generator is in hand. This procedure
results in a relaxation of the original problem, but it is a workable
scheme which can produce acceptable solutions. The procedure is
implemented as follows:


STEP 1: Solve the SPP as a SCP. Identify rows with multiple covers. If
no multiple covers exist, STOP.


STEP 2: For each column covering a row which is multiply covered,
generate a new column which does not include rows with multiple
covers. Original columns are assigned a "cost per row covered"
which is used to give new columns reduced costs proportionate to
the number of rows deleted. Go to STEP 1.
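One pass of STEP 2 might look like the following sketch. Several details
are assumptions made for illustration and are not stated in the text: the
function name refine_once is hypothetical, columns are held as sets of row
indices, only columns in the current SCP solution are refined, the new
cost is taken as (original cost / rows originally covered) times the rows
retained, and a column that would lose all of its rows is skipped. The SCP
solution of STEP 1 is assumed to be supplied by the caller.

```python
from collections import Counter


def refine_once(columns, costs, cover):
    """Generate refined columns from an SCP solution (one STEP 2 pass).

    columns : list of sets of row indices
    costs   : list of column costs
    cover   : indices of the columns selected by the SCP solution
    Returns (multiply covered rows, new columns, new costs).
    """
    times_covered = Counter(i for j in cover for i in columns[j])
    multiple = {i for i, count in times_covered.items() if count > 1}
    new_columns, new_costs = [], []
    for j in cover:
        kept = columns[j] - multiple
        if kept and kept != columns[j]:
            cost_per_row = costs[j] / len(columns[j])
            new_columns.append(kept)
            # cost reduced in proportion to the number of rows deleted
            new_costs.append(cost_per_row * len(kept))
    return multiple, new_columns, new_costs
```

An empty set of multiply covered rows corresponds to the STOP condition
of STEP 1; otherwise the new columns are appended to the problem and the
SCP is solved again.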

Table 10 displays the performance of the refinement procedure on two of
the more difficult SPP's, BUS and AMERICAN. The refinement procedure is
used in conjunction with the Block Partitioning Algorithm set for four LP
subproblems. # REFINEMENTS gives the number of refinement iterations, #
COLUMNS GENERATED gives the total number of new columns generated by the
procedure, and OPT TIME is the total time in CPU seconds required to
achieve the optimal partition.


TABLE 10. RESULTS FOR THE REFINEMENT PROCEDURE 
# REFINEMENTS # COLUMNS GENERATED OPT TIME 


BUS 5 25 12206 
AMERICAN 2 26 94.60 


Comparing the results from Tables 9 and 10, the refinement procedure
produces a solution much more quickly than the other methods (for
AMERICAN, 94.60 versus 536.41 CPU seconds). It is difficult to compare
the solution values, because the true cost for each column generated by
the procedure is not known. The costs assigned here to the new columns
are representative, however, of those which would be assigned by the
column generator.
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VIII. CONCLUSIONS AND RECOMMENDATIONS 


It has been shown that practical, large-scale set covering and set 
partitioning problems can be solved optimally and efficiently. The Block 
Partitioning Algorithm is clearly the most robust and most successful 
technique examined in this study, and its efficiency compares favorably 
with published solution technologies for this problem class. 

Unfortunately, the implementation of the Generalized Network
Reformulation for the SPP/SCP did not perform as well as expected. The
continuous relaxation of the Integer Generalized Network is too weak to
be of much practical use; therefore, this technique does not hold much
promise for the rapid solution of set covering and set partitioning
problems.

Much work remains in improving the integer enumeration scheme
subsequent to the solution of the linear programming relaxation. The
default elastic method works well, but additional research is needed to
improve its performance. The coupling of the column generating program
with the optimizer is a concept which holds great promise for the
efficient solution of problems in this class. As illustrated by the
Refinement Procedure results, spectacular reductions in solution time can
result from implementing this idea.

The proposed standard data input format displayed in Appendix B makes
data manipulation both easy and convenient, and dramatically reduces
storage requirements for any mathematical programming system capable of
exploiting it. A tape containing all of the test problems in this format
is available to those who wish to continue research in this area.
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APPENDIX A. DESCRIPTION OF SELECTED APPLICATIONS 


A. POLITICAL DISTRICTING (SPP)

Let the rows represent m basic population units (such as counties,
census tracts, etc.). Let the columns represent n possible districts or
subsets of the population units such that each potential district meets
the requirements on population size, contiguity, compactness, and so
forth. A side cardinality condition (9) usually imposed is that there be
exactly J districts. If c_j is some ordinal measure of the
unacceptability of district j, then an optimal solution to the SPP yields
an optimal districting plan.


B. COLORING PROBLEMS (SPP) 

Consider the problem of coloring a map so that no two adjacent areas 
have the same color. Let there be m such areas. A column j is generated 
if no two elements of column j correspond to areas having a common 
boundary. If all costs are unity, an optimal partition indicates the 
minimum number of colors needed. A direct application of this concept is 
the problem of minimizing the number of distinct radio frequencies 
necessary to provide service in several geographical areas. A column j 
is generated if no two elements of column j correspond to areas with 


overlapping frequencies. 
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C. NUCLEAR AND CONVENTIONAL TARGETING 
1. Conventional Scenario (SCP) 

Let each row i represent a target which must be engaged at
least b_i times. Let each column j represent a weapons system capable
of engaging some subset of the m targets within a specified time period. 
If the cost coefficients reflect the expected effectiveness of a given 
weapons system on the targets covered by column j, the optimal solution 
will yield the most effective subset of weapons systems capable of 
accomplishing the mission. If columns are generated so that k missions 
are possible for each of p weapons systems, then a constraint will be 
necessary to ensure that each weapon system is given only one mission in 
the optimal solution. The maximal SCP formulation can also be used here 
to find the combination of weapons systems which can engage some specified 
proportion of targets. 

2. Nuclear Scenario (SPP) 

Let each row represent a target which must be engaged only once
in a given time period (to avoid fratricide, for instance). Let each
column j represent a weapons system capable of engaging some subset of
the m targets (i.e., various footprint alternatives). If a unit-cost
objective function is used, the optimal solution will yield the minimum
number of weapons systems needed to destroy all the targets.


D. INFORMATION RETRIEVAL (SCP) 

Consider the problem of retrieving information from n files, where
the j-th file is of length c_j. Suppose that m requests for information
are received. Each unit of information is stored in at least one file
j, indicated by a_ij = 1. An optimal cover yields a subset of files that
minimizes the maximum total length which needs to be searched in order to
guarantee retrieval of all the information.


E. CYCLIC SCHEDULING (SCP) 

A fundamental problem of cyclic staffing is to size and schedule a
minimum cost workforce so that sufficient workers are on duty during each
time period. The k,m cyclic scheduling problem models the task of
finding the minimum cost assignment of workers to shifts so that each
person works k time periods consecutively out of m, and at least b_i
workers are present during day i. A sample tableau for the 5,7
cyclic scheduling problem is shown below.


        x_1  x_2  x_3  x_4  x_5  x_6  x_7     RHS


MON      1    0    0    1    1    1    1    >= b_1
TUE      1    1    0    0    1    1    1    >= b_2
WED      1    1    1    0    0    1    1    >= b_3
THU      1    1    1    1    0    0    1    >= b_4
FRI      1    1    1    1    1    0    0    >= b_5
SAT      0    1    1    1    1    1    0    >= b_6
SUN      0    0    1    1    1    1    1    >= b_7


0 <= x_j <= U_j and integer for all j
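The coefficient pattern of the sample tableau follows a simple rule that
generalizes to any k and m: shift j starts in period j and runs for k
consecutive periods, wrapping around the cycle. The short sketch below
builds the matrix under that reading; the function name cyclic_tableau and
the 0-based indexing are assumptions made for the example.

```python
def cyclic_tableau(k, m):
    """Return the m x m 0-1 matrix of the k,m cyclic scheduling problem.

    Entry [i][j] is 1 when shift j (starting in period j and lasting
    k consecutive periods, cyclically) is on duty in period i.
    """
    return [[1 if (i - j) % m < k else 0 for j in range(m)]
            for i in range(m)]


if __name__ == "__main__":
    days = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]
    for day, row in zip(days, cyclic_tableau(5, 7)):
        print(day, *row)
```

Every column of the result contains exactly k ones and every row contains
exactly k ones, as in the 5,7 tableau.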


The above formulation is not a binary program; therefore, to transform
it into one, two alternative techniques can be used. If it is desired to
distinguish between individual workers, the above seven columns can be
replicated for each worker. An additional side constraint will be
necessary to ensure that a worker is not selected to work more than one
shift. This approach could result in a large number of columns, so
another alternative is to use a binary representation in place of each
x_j. For any reasonable value of U_j, this is a feasible technique.

For large values of U_j, the requirement that the x_j be integer is
probably not worth the computational expense, and the problem should be
solved as a continuous LP. Additional considerations such as overtime,
days-off scheduling, part-time workers, over- and under-staffing, etc.,
are discussed in [Ref. 51].


F. SALES TERRITORY DESIGN (SPP)

A problem facing sales managers is how to identify which customers
should be included in a given sales territory, and how to determine the
best call frequencies for individual customers; in short, how to allocate
a given amount of the time of several salesmen to several hundred
prospective customers so as to maximize sales. Let the rows be customers.
Let the columns represent p sets of candidate territories, one for each of
the p salesmen. Let the costs reflect the potential sales response
evaluations for a particular salesman in territory j. A side constraint
is necessary to ensure that only one territory is picked from each of the
p sets. The requirement that a customer can appear in one and only one
sales territory makes this the SPP.

The generation of candidate territories is a difficult process in
itself. Shanker, et al. [Ref. 8] suggest a procedure which involves
solving a series of integer programs. One ILP selects territories which
maximize demand potential subject to a series of workload, stratification,
and compactness constraints. This set of territories is in turn evaluated
in another ILP which maximizes a piecewise linear response function
subject to calling frequency constraints. Subjective considerations can
be included at various points in the process to help further reduce the
number of candidate territories which finally appear in the SPP.
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APPENDIX B. PROPOSED DATA INPUT FORMAT 


A very large part of the time invested in this research has been
spent manipulating and entering problem data which came to the author in
many different forms. The sheer size of the data sets made them extremely
unwieldy; therefore, it was decided early on that a compact format for
these problems could make data manipulation both easy and convenient, and
would encourage other researchers to adopt this format as a standard.


The format chosen has many advantages for large-scale problems.


1.  It is compact, listing only problem dimensions, constraint ranges,
    cost coefficients, and coefficient addresses. This not only
    reduces Input/Output time, but makes it possible to handle quite
    large data sets under interactive, time-sharing systems such as IBM
    CMS.


2.  Storage requirements are easily calculated. Problem dimensions
    are known immediately after reading the first card image. This
    eliminates the need to make multiple passes of the data, or to guess
    at the problem size, as is the case with MPS format [Ref. 74].


3.  Data Generation Programs are simplified. Row and column labels are
    accommodated, but they are not primary keys, thus avoiding
    alphanumeric manipulations with symbol tables.


4.  Column manipulation of data input is made easy since all information
    for each column is contiguous.


5.  This column format is easily generated by commercially available (MPS)
    problem generation systems.


The data input format consists of three sets of card images:


1.  Problem Dimensions. Format (3I6) (One Card)

    a. M = Number of Rows

    b. N = Number of Columns

    c. NZEL = Number of Non-Zero Elements.


2.  Constraint Ranges. Format (2A4, 2E16.8) (M Cards)

    a. IR = Row Index

    b. RL = Lower Range Limit

    c. RU = Upper Range Limit.


3.  Column Data. (N or More Cards)

    a. The number of cards needed depends upon the number of non-zero
       elements in each column (= NCE). The format for the first
       column card is (2A4, F14.5, 10I5).

       1. JC = Column Index

       2. C_j = Column Cost Coefficient

       3. NCE = Number of Non-Zero Elements in the column

       4. IR = Row Addresses of Non-Zero Coefficients.

       If NCE is greater than 9, additional column cards are needed
       to hold the row addresses for that column. The format for
       additional column cards is (20X, 10I5).
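As an illustration of how little code the format requires, a reader can
be sketched in Python as follows. This is not part of the proposed
standard: the function names are hypothetical, the field positions are
derived directly from the FORMAT statements above, blank integer fields
are simply skipped, and no consistency checks (for example, against NZEL)
are attempted.

```python
def _int_fields(card, start, width=5, count=10):
    """Read up to `count` I5 fields beginning at column `start` (0-based)."""
    values = []
    for k in range(count):
        field = card[start + k * width:start + (k + 1) * width]
        if field.strip():
            values.append(int(field))
    return values


def read_problem(cards):
    """Parse card images in the proposed format.

    cards : iterable of strings, one per card image
    Returns (M, N, NZEL, ranges, columns), where ranges holds
    (label, RL, RU) and columns holds (label, cost, row addresses).
    """
    cards = iter(cards)
    head = next(cards)                               # (3I6)
    m, n, nzel = (int(head[i:i + 6]) for i in (0, 6, 12))
    ranges = []
    for _ in range(m):                               # (2A4, 2E16.8)
        card = next(cards)
        ranges.append((card[0:8].strip(),
                       float(card[8:24]), float(card[24:40])))
    columns = []
    for _ in range(n):                               # (2A4, F14.5, 10I5)
        card = next(cards)
        fields = _int_fields(card, 22)
        nce, rows = fields[0], fields[1:]
        while len(rows) < nce:                       # (20X, 10I5)
            rows.extend(_int_fields(next(cards), 20))
        columns.append((card[0:8].strip(), float(card[8:22]), rows[:nce]))
    return m, n, nzel, ranges, columns
```

Because the dimensions arrive on the first card, a production reader could
allocate all storage before reading the remaining cards, which is the
advantage claimed for the format over MPS.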
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INPUT DATA FOR STEINER1 


TABLE 11. AN EXAMPLE IN PROPOSED STANDARD FORMAT
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